Detection methods for non-Gaussian gravitational wave stochastic backgrounds 
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A gravitational wave stochastic background can be produced by a collection of independent grav- 
itational wave events. There are two classes of such backgrounds, one for which the ratio of the 
average time between events to the average duration of an event is small (i.e., many events are on 
at once), and one for which the ratio is large. In the first case the signal is continuous, sounds 
something like a constant hiss, and has a Gaussian probability distribution. In the second case, the 
discontinuous or intermittent signal sounds something like popcorn popping, and is described by 
a non-Gaussian probability distribution. In this paper we address the issue of finding an optimal 
detection method for such a non- Gaussian background. As a first step, we examine the idealized 
situation in which the event durations are short compared to the detector sampling time, so that 
the time structure of the events cannot be resolved, and we assume white, Gaussian noise in two 
collocated, aligned detectors. For this situation we derive an appropriate version of the maximum 
likelihood detection statistic. We compare the performance of this statistic to that of the standard 
cross-correlation statistic both analytically and with Monte Carlo simulations. In general the max- 
imum likelihood statistic performs better than the cross-correlation statistic when the stochastic 
background is sufficiently non-Gaussian, resulting in a gain factor in the minimum gravitational- 
wave energy density necessary for detection. This gain factor ranges roughly between 1 and 3, 
depending on the duty cycle of the background, for realistic observing times and signal strengths 
for both ground and space based detectors. The computational cost of the statistic, although signif- 
icantly greater than that of the cross-correlation statistic, is not unreasonable. Before the statistic 
can be used in practice with real detector data, further work is required to generalize our analysis 
to accommodate separated, misaligned detectors with realistic, colored, non-Gaussian noise. 



I. INTRODUCTION AND SUMMARY 

Along with a new generation of gravitational wave de- 
tectors around the world detection algorithms 
for a variety of sources are nearing completion. If the 
signals from these sources are detected, physicists stand 
to harvest unprecedented quantities of observational in- 
formation concerning the nature of gravitation and the 
cosmos as a whole. The fruit of this harvest will be the 
outputs of detection algorithms. In this paper we intro- 
duce an algorithm designed for nearly optimal detection 
of a class of gravitational wave stochastic backgrounds. 
The non-Gaussian nature of this class of backgrounds 
causes the algorithm presented here to differ from the 
well studied cross-correlation based algorithms which are 
nearly optimal for Gaussian backgrounds. 
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Detecting a gravitational wave stochastic background 
produced by any one of these candidate sources could 
provide information on a variety of topics ranging from 
the evolution of the star formation rate |l5l| to the num- 
bers and sizes of posited extra dimensions [l6j . Because 
of this, stochastic backgrounds have long been thought 
to be among the most interesting possible types of grav- 
itational radiation. 



B. Gaussian stochastic backgrounds 



A. Gravitational wave stochastic backgrounds 

Consider a large collection of similar gravitational wave 
sources. If we cannot resolve the individual signals pro- 
duced by these sources and know only their statistical 
properties, the signals form a stochastic background. A 
wide variety of candidate sources of gravitational wave 
stochastic backgrounds have been studied (for an excel- 
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In order to develop detection methods, it is tradition- 
ally assumed that the individual events making up a 
background are uncorrelated and sufficiently frequent for 
the background to be Gaussian. That is, it is assumed 
that the conditions for applicability of the central limit 
theorem are satisfied. 

Unlike electromagnetic waves, gravitational waves can- 
not be screened from a detector. Using a single gravita- 
tional wave detector, there is no practical way to distin- 
guish between detector noise and a stochastic background 
of gravitational waves. As a consequence the sensitiv- 
ity of a single detector to gravitational backgrounds is 
severely limited. By comparing the outputs of multiple 
detectors, sensitivity levels can be enhanced. Michelson 
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[TtI I was the first to give a detailed description of such a 
detection method for a Gaussian stochastic background 
of gravitational waves in the presence of Gaussian detec- 
tor noise. His detection strategy and its later refinements 
fl8, 19. 2QJ are often referred to as the cross-correlation 
method. Recently the cross-correlation method has been 
modified to treat more realistic detectors which them- 
selves have sources of non-Gaussian noise 21, 22, 23J. 

We now briefly review the cross-correlation method. 
Consider two gravitational wave detectors. The output of 
each detector is a collection of dimensionless strain mea- 
surements. Suppose that N such measurements are made 
with each detector at regular time intervals. Denote these 
measurements by a iV x 2 matrix h with components h^, 
where z = 1, 2 labels the detector, and k = 1,2, ... ,N 
is a time index. To determine whether or not the data 
h contains some desired signal, one usually compares the 
value of some detection statistic r(/i) to a threshold value 
r*. That is, if T{h) > one concludes that a signal 
is present and otherwise one concludes that no signal is 
present. A detection statistic is said to be optimal if it 
yields the smallest probability of mistakenly concluding 
a signal is present (false alarm probability) after choos- 
ing a threshold which fixes the probability for mistakenly 
concluding a signal is absent (false dismissal probability) . 

Assume that the two detectors are collocated and 
aligned, and that each detector has white Gaussian noise 
with vanishing mean with no correlations between the 
two detectors. Then the standard cross-correlation de- 
tection statistic Acc for a Gaussian signal is 



C. Non-Gaussian stochastic backgrounds 

A particular class of events will produce a Gaussian 
background if, on average, at any given moment, many 
individual events are arriving at the detector. However, 
if the ratio of average time between events to the av- 
erage duration of events is large, then there are long 
stretches of "silence" or time during which no events 
arrive at the detector. The resulting stochastic back- 
ground is non-Gaussian as the conditions for the appli- 
cability of the central limit theorem are not satisfied. 
Recent work has suggested that some candidate grav- 
itational wave stochastic backgrounds, of both cosmo- 
logical and astrophysical origin, may be non-Gaussian 
O^IE^I' However, predictions concerning the proper- 
ties of most gravitational wave background sources rely 
heavily on theoretical arguments which extrapolate well 
beyond observational support. Such extrapolations are 
always in some sense speculative. It is conceivable that 
backgrounds predicted to be Gaussian may in fact turn 
out to be non-Gaussian, or vice versa. 

In Sec. nil CI below, we apply a maximum likelihood 
framework to derive a detection statistic for a particu- 
lar model of non-Gaussian stochastic background, which 
we now describe. Let be the outputs of two collo- 
cated aligned gravitational wave detectors with white, 
zero-mean, Gaussian noise with no correlations between 
the two detectors. The detector outputs consist of 
noise together with a common signal s*^: 
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for i — 1,2, and 9{x) is the Heaviside step function de- 
fined by 
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This statistic is nearly optimal and can be derived from 

The 
The 



a maximum likelihood framework (see Sec. IIII B)l . 
subscript CC in Acc denotes "cross correlation", 
generalization of this statistic to allow for colored noise 
and non-collocated, non-aligned detectors is discussed in 
Refs. |17|,d^aa,20j. 



We wish to detect a non-Gaussian signal composed 
of long stretches of silence which separate short bursts 
whose amplitudes are Gaussianly distributed, and whose 
durations are smaller than the detector resolution time 
(see Fig.0J. We therefore assume that each signal sample 
is statistically independent with probability distribu- 
tion [cf. Eq. below] 



p{s) = e 
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+ {i-ms). (1.7) 



The parameter ^ is what we call the Gaussianity param- 
eter of the stochastic background; it is the probability 
that, at any randomly chosen time, a burst is present 
in the detector. Thus ^ takes values in the interval 
< ^ < 1, and if ^ = 1 then the background is Gaussian. 
The parameter ^ can also be thought of as the duty cycle 
of the background. The parameter a in Eq. (|1.7|) is the 
rms amplitude of the bursts. 

Our nearly-optimal detection statistic for the sig- 
nal model Ijl.Tfl is given by [cf. Eq. (|3.22|) below] 
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Here the quantities a\ and tT2 are defined by Eq. 1)1. 4|l . 

The values of ^, a^, (j\ and cr| which achieve the maxi- 
mum in Eq. p.8|l are, respectively, estimators of the sig- 
nal's Gaussianity parameter, the variance of the signal 
events, and the variances of the noise in the two detec- 
tors. If we calculate the quantity 1)1. 8|l at ^ = 1, instead of 
maximizing over ^, the result is a statistic which is equiv- 
alent to the standard cross-correlation statistic Acc ■ 

The subscript ML on Aj^^ stands for "maximum 
likelihood", while the superscript NG stands for "non- 
Gaussian statistic" . The superscript NG does not neces- 
sarily mean that one is considering a non-Gaussian sig- 
nal; both of the statistics Acc and Aj^^ can be applied 
to data containing either a Gaussian signal or a non- 
Gaussian signal. 

If the burst-amplitude parameter a is sufficiently large 
and the bursts are well separated in time, then the indi- 
vidual bursts can be seen in the detector output. In this 

case one could use, for example, the simple burst statistic 
1 

Ab = max \h\\. (1.9) 

l<fe<Af ' ' 

on the data from detector 1 to detect the signal. The 
burst statistic (|1.9|l and the cross-correlation statistic 
Acc S'l'e used as references for comparison for the maxi- 
mum likelihood statistic below. 



D. Main results 

There are two main results in this paper. The first 
result is the detection statistic Aj^^ given by Eq. H1.8|l . 
which is derived in Sec lIIlCl This statistic is nearly op- 
timal for the detection of a class of non-Gaussian gravi- 
tational wave stochastic backgrounds incident on a pair 
of idealized detectors. 




FIG. 1: This plot shows the minimum gravitational- wave en- 
ergy density ^detectable necessary for detection, for several dif- 
ferent detection statistics, as a function of the background's 
Gaussianity parameter ^. The Gaussianity parameter ^ is 
the probability that, at any randomly chosen time, the waves 
from an event are incident on the detectors, and thus takes 
values in the interval < ^ < 1. For a Gaussian background 
^ = 1. The circles are the results of our Monte Carlo simula- 
tions for the maximum likelihood statistic ts^^, and the solid 
curve shows the approximate theoretical prediction IIC46t and 
IIC55II for this statistic (expected to be accurate only to within 
a few tens of percent). The crosses are the Monte Carlo re- 
sults for the cross-correlation statistic Acc, and the dashed 
curve shows the theoretical prediction 14.211 for this statistic. 
Finally the squares are the Monte Carlo results for the burst 
statistic (11.911 . and the dotted curve shows the corresponding 
theoretical prediction given by Eqs. (14.91 and 14.101 . For each 
statistic, the vertical error bars on the Monte Carlo simula- 
tion results give the fluctuations computed from 4 different 
runs, each with 2000 trials. The number of data points is 
TV = 10**, and the false alarm and false dismissal probabilities 
are both 0.1. A detailed description of the simulations and 
the analytical predictions can be found in Sec. IIVI 



^ In reality the statistic 11.91 would be especially susceptible to 
non-Gaussian noise bursts in the detector and so would not be 
used in practice; instead one would need search for events where 
\h\\ and \h\\ are simultaneously large. In this paper we restrict 
attention for simplicity to Gaussian detector noise; it will be 
important for future more general analyses to to allow for (un- 
correlated) non-Gaussian noise components in the two detectors. 



The second main result, summarized in Figs. Q and |21 
is a comparison of the performances of the maximum like- 
lihood statistic AJ^lj the cross-correlation statistic Acci 
and the burst statistic Ab- That comparison is quanti- 
fied in terms of the the minimum gravitational-wave en- 
ergy density fidctcctabic necessary for detection. The val- 
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ues of this quantity for the three different statistics AJ^^j 
Acc and Ab we will denote by ^^^etcctabic ^detectable' and 
^detectable' respectively. Results for these three quanti- 
ties obtained from Monte Carlo simulations are shown 
in Fig. ^ which gives rJdotectabie as a function of ^ for 
N — 10^ data points. The Monte Carlo simulations are 
described in Sec. IIV Bl below. The figure shows that in 
the limit ^ 1 of Gaussian signals, the statistics A^^ 
and Acc perform approximately equivalently (the cross- 
correlation statistic is slightly better). As the Gaussian- 
ity parameter ^ is decreased, the performance of A^^ 
improves, until at ^ ~ 10~^-^ it is better than that of 
Acc by about a factor of 3 in energy density. Finally, in 
the limit ^ ^ 0, the individual bursts become visible and 
the burst statistic Ab becomes the best statistic. 

Figure ^ also shows theoretical curves for the three 

quantities f^^tectabic ^do^ectabic and ridctoctabie- Thcsc 
curves are derived and discussed in Sec. IIVI below. For 
the burst and cross-correlation statistics, the theoretical 
curves should have a fractional accuracy ^ 1/\/N. For 
the maximum likelihood statistic, the theoretical predic- 
tion is expected to be accurate to a few tens of percent. 
These expected accuracies are confirmed by the Monte 
Carlo simulations, as seen in Fig. 

The value N = 10* of the number of data points 
is roughly appropriate for a space based detector like 
LISA, for which the duration of a measurement might 
be 1 year and the effective bandwidth ~ 10^'' Hz. 
However, for year-long observations with ground based 
detectors, the effective bandwidth will be ^ 100 Hz and 
consequently the appropriate value of is ^ 10^. We 
were unable to perform Monte Carlo simulations for this 
large value of TV due to limitations in available comput- 
ing power. However, we show in Fig. [21 the theoretical 
curves for the three different statistics as functions of ^ 
for N = 10^. In this case, the maximum likelihood statis- 
tic starts to outperform the cross-correlation statistic at 
^ ~ 10""^, and the maximum gain factor in energy density 
is of order ^ 2. 

We next discuss the computational cost of the max- 
imum likelihood statistic Aj^^- is well known, the 
computational cost of trying to detect a stochastic back- 
ground using the cross-correlation statistic Acc is neg- 
ligible when compared to, say, matched-filter-based in- 
spiral waveform searches. However, because of the non- 
trivial maximization in Eq. H1.8|l . the maximum likeli- 
hood statistic A|^L is computationally intensive. In fact, 
every evaluation of the function to be maximized over 
the four parameters ^, a, cti, and (T2 requires comput- 
ing a length- sum or product, where N is the number 
of data points, and takes longer than the entire cross- 
correlation detection method. Depending on the method 
of calculation, the computational cost of computing Aj^^ 
is larger than that of computing Acc by a factor any- 
where from 10^ to 10*. 

To summarize, under the idealized assumptions of this 
paper, if one searches for a stochastic background using 
the standard cross-correlation statistic, then one might 
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FIG. 2; The minimum gravitational-wave energy density 
f^dctcctabic necessary for detection as a function of the back- 
ground's Gaussianity parameter ^ for TV — lO'"* data points, 
which is a realistic number of data points for ground based de- 
tectors. The false alarm and false dismissal probabilities are 
both 0.01. The solid line is the theoretical prediction (IU46II 
and IGSSII for the maximum likelihood statistic, which is ex- 
pected to be accurate to a few tens of percent. The dashed 
line is the theoretical prediction l|4.6^ for the cross correla- 
tion statistic, and the dotted line is the theoretical prediction 
14.911 - 14. ion for the burst statistic; see caption to Fig.0 This 
plot indicates a maximum gain factor of ~ 2 in energy density 
for duty cycles in a narrow band near ^ ^ 10~*. 



not detect a signal that would have been detectable using 
our maximum likelihood statistic. This conclusion prob- 
ably generalizes to realistic detector noise models and 
detector orientations. 



E. Outline of this paper 

In Sec. ^ we introduce notation, review the general 
theory of signal detection and parameter measurement, 
and derive a general form of the maximum likelihood 
detection statistic. Then, in Sec. lIIII we derive the maxi- 
mum likelihood statistics for both a Gaussian background 
(Sec. nil Bll and for the model (|1.7|l of a non-Gaussian 
background (Sec. IIII Cp . assuming two idealized detec- 
tors. In Sec. IIVI we discuss analytical calculations and 
Monte Carlo simulations comparing the performance of 
the maximum likelihood and cross-correlation detection 
statistics. Also in Sec. lIVI we show how the signal param- 
eters ^ and a can be estimated, with reasonable accuracy, 
for a strong non-Gaussian background. We conclude in 
Sec. 13 with a discussion of the results. 
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II. GENERAL THEORY OF DETECTION 
STATISTICS AND PARAMETER ESTIMATION 

In this section we review various formal aspects of the 
theory of signal detection and measurement. We derive a 
form of the maximum likelihood detection statistic that 
is more general than has been considered before in the 
context of gravitational wave data analysis pQ, .24, .25^ ,26 . 
|2^ . The material in this section can be found in a variety 
of texts (28l| ; we include this section for completeness and 
to introduce notation. 



and dvl^ is the Ith component of (iv„ . We also use vertical 
bars to denote conditional probabilities. For example 

PM\viri\^n)d n= — — ^ d™^n 2.5) 

is the probability that n < M < n + dn given that v„ < 
V„ < v„ + dv„. 

We will often use the so-called total probability theo- 
rem _29J to write probability densities for a specific ran- 
dom variable as an integral over the functional depen- 
dencies of that random variable. An example is 



A. Notational conventions 

We use calligraphic letters A,B,C,... to denote ran- 
dom variables. As described in Sec. II BL given D detec- 
tors we can assemble an N x D detector output matrix 
H with components TC^ where k = 1,2, ... ,N is a time 
index, and i — 1,2, ... ,D labels the detector. We assume 
that the detector outputs are made up of noise Af and 
signal S with components J^^ and 5f respectively, such 
that 



n = jv + s. 



(2.1) 



Specific realizations of random variables will be denoted 
by lower case Roman symbols. For example, h = n + s is 
a specific realization of Eq. H2.1|l , where the components 
of h are h'j^. 

Probability densities for random variables will always 
be denoted by a lowercase p and will carry a subscript to 
indicate which random variable is being described. For 
example, pj^{n)d^^n is the probability that n < Af < 
n + dn, where d^^n is the product 



N D 



HUM 

k=l i=l 



(2.2) 



We write the normalization requirement for p_\f{n) as 



iND 



n pM{n). 



(2.3) 



Unless otherwise specified, integrals are over where 
K is the set of real numbers. 

We assume a detector noise model with Qn parameters. 
Let V„ be a vector of length Qn whose components are 
the parameters characterizing the noise in the detectors. 
We denote by 6„ the space of all possible values of V„. 
Here the subscript n is not an index; it is merely short for 
"noise" . We denote joint probabilities in the usual way. 



For example, pj^ y^{n, v„)(i 



• Vn is the probability 



that n < J\f < n + dn and v„ < Vn < v„ 
dQ "Vn is defined by 



d^"v„ 



Y[dvi, 

1=1 



dvn, where 



(2.4) 



PN-in) 



d'^"Vn PAA|V„("-|Vn)pv„(v„). 



(2.6) 



Expanding probability densities in this way allows us 
to treat parameters, such as the noise parameters Vn 
in Eq. H2.6|l . as unknowns. In fact, such a treatment 
of the noise parameters is the crucial difference be- 
tween the derivations of this work and those in previ- 
ous studies of g ravitational wave data analysis techniques 

Hi H El El m. 

We assume that the signal model contains Qs param- 
eters, which we will treat as random variables like the 
noise parameters. We will denote by Vs the random vec- 
tor of length Qs containing the signal parameters, and 
by 8s the space of all possible values of Vs. 

We define the notions of "signal present" and "signal 
absent" in terms of a partition of the space Qs of signal 
parameters into a disjoint union 



Qs = 6^0 u e 



(2.7) 



where Qso corresponds to the signal being absent, and 
Qsi the signal being present. We define the random vari- 
able T, taking values T = or T = 1, according to 



T 



1 if Vs e Qsi 
if Vs e Qso 



(2.8) 



Thus T = 1 corresponds to a signal being present, and 
T = to no signal being present. We define 



P5|v^,r(s|vs,0) 





<5^^(.s) 



if Vs G 6si 
if Vs e Qso 



(2.9) 



where S'^^{s) is the N x D dimensional Dirac delta func- 
tion. We denote by PT,n{ti h)d^^h the probability that 
T — t and that h < H < h + dh, where i = or 1. 
Similarly 



Pn\Tmd''''h 



Prit) 



(2.10) 



is the probability that h < H < h + dh given that T ^ t. 

We denote probabilities (as opposed to probability 
densities) with an uppercase P. For example Pt(1) is 
the probability that a signal is present, and Pt{0) is the 
probability that a signal is absent. 
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Before examining the detector outputs, we may have 
some idea, say from previous experiments, of the proba- 
bihty that a signal will be present. We denote this prior 
probability by P'-^\ We denote by P^^^ the posterior 
probability that the signal is present after examining Ti. 
in the context of all prior experiments etc. All posterior 
quantities have an implicit dependence on the detector 
outputs. To simplify the notation we will not explic- 
itly show this dependence. For example, we write P'^' 
rather than the more cumbersome P'^^\'H) for the pos- 
terior probability that a signal is present. 

There are prior and posterior versions of all proba- 
bility densities. When necessary we will append super- 
scripts of (0) and (1) to distinguish priors and posteriors 

respectively. For example Pyj(v„) = Pv„|H(vn|/i) is the 
posterior probability density for V„. The posterior distri- 
bution for the noise can be expanded in terms of (v„) 
as 




'Wn PAr|V„("-|Vn)Pvj(v„). (2.11) 



0.4 0.6 



FIG. 3: False dismissal versus false alarm curves for typical 
detection statistics. 



The conventions and symbols which have been intro- 
duced above are summarized in tables ^ and ^ respec- 
tively. 



B. Detection statistics 

To detect a signal one uses a detection statistic, say 
F — F(7i), that is some function of the detector outputs 
Ti. A signal is said to have been detected when F exceeds 
some threshold value F* . 

Denote by Pfd(F*) the probability of false dismissal, 
that is, the probability that we fail to detect a signal 
which is actually present. Similarly, let Pfa(F*) be the 
probability that we claim to have detected a signal which 
in fact is absent — the probability of false alarm. For given 
signal and noise models and for a given statistic F, the 
false alarm and false dismissal probabilities generate a 
curve in the Pfa-Pfd plane parametrized by the thresh- 
old F*. Such curves depend on the number of detectors 
D, the number of data points N , the signal parameters 
Vs, and the noise parameters V„. 

Suppose that the statistic F is bounded in the sense 
that there exist numbers Fmin and Fmax such that Fmin < 
F < Finax for all Ti. Then it is clear that PFD(Fmin) = 
and that PFA(Fmin) = 1. As the threshold F* increases 
toward Fmax, Pfd(F*) will increase while Pfa(F*) de- 
creases, until finally at F* = F^ax, Pfd = 1, and 
Pfa — 0. Thus, false dismissal-false alarm curves gen- 
erally look something like those sketched in Fig. |3| 

Note that if one uses a different statistic /(F), where 
/ is any function, then the shape of the Pfa-Pfd curve 
does not change as long as / is monotonic in the sense 
that 



Only the parametrization of the curve changes under such 
a transformation. Statistics related by transformations / 
satisfying the monotonicity property H2.12|l have identical 
false dismissal versus false alarm curves. 

In 1933 Neyman and Pearson considered a simple sig- 
nal detection scenario where the sets 0„, Osi, and Oso 
each contain a single element 30| . They showed that for 
this scenario the detection statistic which minimizes Pfd 
for any Pfa is the so-called likelihood ratio A, defined by 



A 



Pn\T{h\^) 
Pn\T{h\Q)' 



(2.13) 



F > F* 



/(r)>/(FO. 



(2.12) 



One notion of optimality for detection statistics is that 
the statistic should minimize the false dismissal proba- 
bility at a fixed value of the false alarm probability. For 
the simple scenario above, this criteria, known as the 
Neyman-Pearson criteria, uniquely determines the like- 
lihood ratio as the optimal statistic [sj- However in 
general, when any of 0„, O^i, or O^o contains more than 
one element, the statistic selected by this criteria is a 
function of the unknown parameters Vs and V„. Thus, 
as is well known, the Neyman-Pearson criteria does not 
single out a unique statistic in such cases. 

In this paper we will obtain our detection statis- 
tics from Bayesian considerations, but we will quantify 
their effectiveness using the Neyman and Pearson crite- 
ria of comparing false dismissal probabilities at fixed false 
alarm probabilities. 



C. Likelihood ratio and likelihood function 

From a Bayesian point of view, a natural criterion for 
deciding that a signal is present is for the posterior prob- 
ability p(^' to exceed some threshold 32]. The posterior 
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TABLE I: A summary of conventions introduced in Sec. Ill A1 



Convention 

Random variables are denoted by upper case calligraphic 
letters. 

Specific realizations of random variables are denoted by lower 
case Roman letters, (see next convention) 

A lower case p denotes a probability density function (PDF) . 
It's subscript determines the quantities with which it is as- 
sociated. 

A comma in a PDF subscript and argument indicates a joint 
PDF. 

A vertical bar in a PDF subscript and argument indicates a 
conditional PDF. 

An upper case P denotes a probability. 

Prior and posterior quantities are denoted by superscripts of 
(0) and (1) respectively. 



Example 

The detector output matrix is denoted by 7i. 

A specific observation run may result in a specific detector 
output matrix h or say x. These results would be denoted 
Ti. = h and Ti. — x respectively. 

The PDF for the detector output 7i as a function of h, or 
say X, is denoted by pn{h) and pn{x) respectively. 

The joint PDF for M and V„ as a function of n and v„ 
respectively is denoted by PAr,v„ ("■, v„). 

The conditional PDF for J\f and V„ as a function of n and 
v„ respectively is denoted by p^|v„(n|v„). 

The probability that T = 1 is denoted by Pr(l). 
The prior probability that a signal is present is denoted by 
P^°\ while the posterior probability that a signal is present, 
after an observation TL = h, is denoted by P'-^-' = Pq-\n{^\h)- 



TABLE IL A summary of symbols introduced in Sec. Ill A1 

Symbol Meaning 

Ti, h detector output matrix 

N , n noise contribution to detector output matrix 

<S, s signal contribution to detector output matrix 

N number of strain samples taken from one detector 

D number of detectors 

Qn number of parameters in the model noise PDF 

Qa number of parameters in the model signal PDF 

Vn , v„ the parameters of the model noise PDF 

Vs , Vs the parameters of the model signal PDF 

On the space of all possible values of V„ 

Os the space of all possible values of Vs 

QsO the subspace of Oj, for which a signal is absent 

Bsi the subspace of Oa for which a signal is present 

T,t 1 if a signal is present (Vs G Osi), otherwise 

P'"-* prior probability that a signal is present 

P'^' posterior probability that a signal is present 



probability P*-^-* is related to the prior probability P'-"-' 
and to the likelihood ratio A defined by Eq. (|2.13|) by 



p(i) 
1 - P(i) 



A 



p(0) 

1 - P(o) ■ 



(2.14) 



See appendix^for a derivation of Eq. (|2.14() in the most 
general context where the sets Q„, ©si, and Q^o are 
all non-trivial. It follows from Eq. (|2.14(l that P^^^ is 
a monotonia function of A, so thresholding on P^^^ is 
equivalent to thresholding on A. This makes A, or ap- 
proximate versions of it, the natural choice for a detection 
statistic. 

We derive in Appendix^the following general formula 
for the likelihood ratio as a function of the data Ti — h: 



A 





1 L 







d^"vn PAr\v„{h - s|vn)pv„(v„)p5|v^,r(s|vs, l)pv,|r(vs|l) 



(2.15) 



The various probability distributions that appear in Eq. 
(|2.15l) are (i) the prior distribution pv^|7-(vs|l) for the 
signal parameters ; (ii) the distribution p5|y 7-(s|Vs,l) 



for the signal s given the signal parameters v^; (iii) the 
prior distribution pv„(vn) for the noise parameters v„; 
and (iv) the distribution Ptv'IV i^'-l'^n) for the noise n 
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given the noise parameters v„. 

We can interpret Eq. (|2.15|l as follows. In the simple 
signal detection scenario, we choose between a pair of 
simple claims: (i) = v^o or (ii) Vs = v^i. In general 
we choose between a pair of complicated, or composite, 
claims: (i) Vs G Qso or (ii) Vs G 6si, where both 9^0 
and Qsi contain many elements. Equation H2.15|l says 
that the best way to chose between a pair of complicated 
claims is to first break the complicated pair of claims into 
pairs of simple claims, then compute the likelihood ratio 



The likelihood function^ A(vs, v„) can be used to com- 
pute the posterior probability density Py |7-(vs, v„|l) 
for the signal and noise parameters given that a signal is 
present, via the formula 

. p(0) 

(2.18) 

A derivation of Eq. (|2.18|l can be found in appendix ^ 



D. Maximum likelihood detection statistics and 
parameter estimators 

In many applications, it is impractical to compute 
the detection statistic (|2.15|) because of the multi- 
dimensional integrals involved (331] . However, approxi- 
mate versions of the statistic are often easier to compute 

^ There are two different conventions for the definition of the like- 
hhood function. Some authors include the probability distribu- 
tions for Vs and Vn in the definition of A(Vi,, v„) as we have in 



is a natural approximate version of A ^. The subscript 
ML denotes that (|2.19() is the maximum likelihood ap- 



^ In the event that the priors for Vs and Vn restrict these param- 
eters to regions 0'^-^ C ©3I and 6J, C &„, the bounds of the 



for each pair of simple claims, and sum the results of each 
choice. That is, the likelihood ratio can be written as an 
integral over the parameters of the composite claims 

A= [ d^'vs [ A(v„v„), (2.16) 

where the integrand A(vs,v„), which we refer to as the 
likelihood function, can be read off from Eq. I|2.15(l : 



(2.17) 



I 

and useful. If a signal is present with sufficiently large 
amplitude, then the integrand in the numerator of Eq. 
(|2.15l) will be sharply peaked. The integrand in the de- 
nominator of Eq. H2.15|l will also be sharply peaked when 
there is sufficient data that the noise is well characterized. 
Under these circumstances, the integrals can be written 
as the values of the corresponding integrands at the peaks 
multiplied by "width factors" , where the width factors 
depend only weakly on the data h and can be neglected 
without affecting much the performance of the statistic. 
[The width factors from the integrals over the noise pa- 
rameters will tend to cancel between the numerator and 
denominator]. Also, frequently the prior distributions for 
Vs and Vn are slowly varying, and neglecting those dis- 
tributions has a negligible effect on the performance of 
the statistic. Under these conditions the maximum like- 
lihood detection statistic Aml defined by 

Eq. 1^.171 . while others leave these out of A(Vi,,v„) and would 
show these distributions explicitly in Eq. 12.161 . 



(2.19) 

I 



maximizations in Eq. 12.191 should be changed to @si —* &'si 
and 6n — ► S'„ ■ 



A(vs, v„) 



jND, 



Paa I v„ ( ^ - s I v„ )P5 1 , r (s 1 , 1 )py„ ( v„ )py^ I r ( Vs , 1 ) 



Aml — 



max max d^^sp^i^ {h ~ s\vn)ps\v ,t{sWsA) 



max p^f\vJh\VJ 
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proximate version of A. See Ref. [23 for further discus- 
sion of Aml as an approximate version of A ^. 

A particular special case of the detection statistic 
(|2.19l) . which is widely used, is the following. As- 
sume that the noise parameters have some known values 
V„ = v„. Then the noise priors and the 0„ integrals 
in Eq. (|2.15() arc trivial, and one obtains the detection 
statistic 



Aml = 



max 



s PAf\vS^ - s|v„)p5|v^,r(s|vs, 1) 



PN-\v (^|v„) 

(2.20) 

See Ref. |2a| for an exploration of the statistic (|2.2()|l 
in the context of stochastic backgrounds. We will show 
below that for a Gaussian stochastic background, Aml 
reduces to the standard cross-correlation statistic while 
the more specialized statistic Aml does not. Thus for 
stochastic backgrounds, treating the noise parameters as 
unknowns is crucial |2^ . 

When the noise and signal parameters V„ and can 
take on many values, one naturally would like to know 
which values are realized. Equation (|2.18f) suggests using 
the values v„ and defined by 



A(v, 



max max A(vs,v„). 



(2.21) 



The estimators v„ and Vj are known as maximum like- 
lihood estimators. Note that — and v„ = v„ also 
maximize the numerator in Eq. (|2.19l) . For the remain- 
der of this paper we will use Aml, defined by Eq. H2.19|l . 
as our detection statistic, and and v„, defined by 
Eq. H2.21|l . as parameter estimators. 



III. APPLICATION TO STOCHASTIC 
BACKGROUND SEARCHES 

In this section we derive the maximum likelihood de- 
tection statistic (|2.19|l for a simplified model of the de- 
tection problem for stochastic gravitational waves, and 
for a specific simple model of a non-Gaussian stochastic 
background. 



the noise in both detectors to have vanishing mean and 
to be both Gaussian and white, so that 



N 



P7V|V„ ["1(0-1, Cr2)] = Yi T~ "^^P 



k=l 



27rCTiCT2 



(n 



fe\2 

1) 



{n: 



k\2 
2) 



2g\ 



2al 



(3.1) 

The parameters cti and 02 in Eq. (|3.1|l are the square 
roots of the variances of the noise in the two de- 
tectors. For this model v„ = (tTi,(T2) and 0„ — 
{(0-1,0-2) I ci > and 0-2 > 0}. 

We assume that the detectors are collocated and 
aligned, so that the same signal is present in both de- 
tectors 



(3.2) 



Lastly we assume that the individual signal samples are 
uncorrelated and identically distributed, i.e., the signal 
is white, so that 



N 



Ps{s) = Y[ps>'{s'')- 



(3.3) 



fc=i 



Our assumptions (|3.1|I - H3.3|I are unrealistic for both 
ground-based and space-based detectors: we expect the 
noise to be colored with significant non-Guasssian com- 
ponents, and in general detectors will not be co-located 
and aligned. Our analysis is therefore just a first step, 
and will need to be generalized. However, we expect 
that our central conclusion — the existence of statistics 
which outperform the standard cross-correlation statis- 
tic for nonGaussian signals — is robust, and will not be 
altered when these complications are taken into account. 

We now derive a general formula for the maximum 
likelihood statistic (|2.19|) . which we apply in both the 
Gaussian and non-Gaussian cases in the following two 
subsections. The denominator in Eq. H2.19|l can be writ- 
ten, from Eq. H3.1|) . as 



max max 

CTi>0 cr2>0 



-N 



exp 



(27ro-io-2^ 
where and o-| are defined by 





(-1 











(3.4) 



A. Assumptions 

We assume two detectors with outputs 7Y^, where 
i = 1, 2 labels the detector and fc = 1, 2, . . . , is a time 
index. We assume that the noise in detector one is un- 
correlated with the noise in detector two. We will require 



1 ^ 
N ^ 



k=l 



(3.5) 



for i = 1,2. It is easily shown that the maximum in 
Eq. (|3.4|l is achieved at CTj = at. From Eq. (|2.19() this 
yields 
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* Note that Aml is an approximate version of A only in the sense 
that the false dismissal versus false alarm curves of the two statis- 
tics will be close to one another. The numerical values of Aml 
and A will in general differ significantly, due to the width fac- 



tors and priors. Therefore the statistic Aml cannot be used in 
Eq. 12.141 to compute Bayesian thresholds for detection given a 
desired value of P^^\ 



Aml — 



max max 
vseSsi v„ee„ 



Pa^|v„ [h - s|v„)p5|v^,r(s|vs, 1) 



(27rcricr2) ^exp(— iV) 

Combining this with Eq. yields the foUowing final general expression for the maximum likelihood statistic 



Aml = max max max |T / ds*^ pofcrn rfs'^lvs, 1) exp 

v,Ge,i (Ti>0 cr2>0 CTi 0-2 7-00 



k=l 



2<jI 



2ol 



+ 1 



(3.6) 



(3.7) 



r 



B. Gaussian signal 

We now consider the case where the signal is Gaussian 
and has a vanishing mean. We denote by o? the variance 
of the signal, so the prior for S is given by 



1 



27ra 



exp 



2a2 



(3.8) 



{a) has only one component, and 



For this model 
6,1 = {a I a>0}. 

Substituting the signal probability distribution (|3.8I) 
into the general expression (|3.7() for Aml yields a Gaus- 
sian integral which is straightforward to evaluate. The 
result is 



and (|3.12|l are the maximum likelihood estimators for 
the variance c? of the signal and the variances a\ and a\ 
of the noise in the two detectors. The step functions in 
Eqs. (|3.11(l and (|3.12(l arise as a result of the bounds of 
the maximization in Eq. (|3.7(l . 

The corresponding detection statistic is, from Eq. 13. 9|) 



A' 



ML 



-N/2 



(3.13) 



The cross-correlation statistic Acc can be obtained from 
Aml ^ monotonic transformation which preserves 
false dismissal versus false alarm curves [cf. Eq. (I2.12f) 
above] : 



'-ML 



X exp 



max 

(T1>0 


max < 

cr2>0 


(71 (72 




+ (7^q;2 


+ (7|a2 










— =- + 1 

2al + 






+ ^) 


2al 



(3.9) 



N 



where 



1 ^ 

-yh\h 



(3.10) 



fe=i 



and we have appended a superscript G on A^^ to in- 
dicate the maximum likelihood detection statistic for a 
Gaussian signal. 

One can show that the maximum in Eq. (|3.9|l is 
achieved dX a = a, ai — di, and 02 — 02, where 



Acc = 



Vi-(ASl)- 



2/N 



a 



(7l(72 



(3.14) 



Note that if we had assumed the noise parameters 
Vn = (ci,CT2) were known, and derived a statistic from 
Eq. I|2.20|l rather than Eq. H2.19|l . we would have found 
instead the detection statistic A^^ = Aml ^(Aml)' 
where 



AJJl = a' 



2 2 

^^2 1-2 2\ . '^1 /-2 2\ 

-oK - ^1) + -2\^2 - ^2) 



, (3.15) 



which is different from the standard cross-correlation 
statistic. This non-standard result is obtained because 
of the unrealistic assumption that the noise parameters 



6? 6{6?) 



(3.11) 
(3.12) 



for i = 1,2, and ai and (72 are given by Eq. (|3.5|l . Here 
9{x) is the step function H1.5|l . The quantities (|3.11|) 



^ To simplify the formula for we assume that cr^ — 



> 0. 

;2 _ 



This will be true for any realistic value of since ct^ ^ — 
'^itvuc + 0{1/VN), where ai, 

true is the true value of o"i and the 
second term describes the statistical fluctuations. 
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Vn = (o'i,o'2) are known. Different derivations of the 
result |XT3|l can be found in Refs. HHH. 

It is often useful to characterize the "strength" of a 
stochastic background in terms of the signal-to-noise ra- 
tio of the cross-correlation statistic (I3.14() , which we now 
define. First note that for large N, the fractional fluctua- 
tions in will be much larger than those in aia2 ^- For 
the purpose of defining the signal-to-noise ratio, we as- 
sume that N is large enough that ai and (72 in Eq. (|3.14() 
can be taken to be independent of h, so that Aqc and 
are equivalent detection statistics. We also use d^ 
instead of d^ in the computations that follow, as is con- 
ventional when defining signal-to-noise ratios. If a signal 
is present, then the expected value of d^ is, from Eqs. 

xrn . jnsj and urn^ . 



(3.16) 



If no signal is present, so that — 0, then the fluctua- 
tions in d^ are given by 



aicr2 



N 



(3.17) 



The signal-to-noise ratio p is defined to be the ratio of 
these two quantities: 



0- 



FIG. 4: Sketched segment of the signal produced by a model 
non-Gaussian stochastic background of events unresolved by 
the detectors. Here we show two events. The solid curve 
is the exact signal. This exact signal's contributions to the 
detector outputs, shown as stemmed o's, are averages of the 
exact signal over the detector resolution timescale. 



tTltT2 



(3.18) 



C. Non-Gaussian signal 

As mentioned in the introduction, the traditional as- 
sumption that a gravitational wave stochastic back- 
ground will be Gaussian requires the individual events 
to be sufhciently frequent and uncorrelated. Our model 
for a non-Gaussian signal assumes instead that the events 
are infrequent. 

Consider a collection of similar events generating a 
stochastic background S. Let ^ be the probability that, 
at any randomly chosen time, the waves from an event 
are arriving at the detectors. We assume that the time 
structure of individual events cannot be resolved by the 
detectors. That is, we assume that the events occur over 
timescales smaller than the detectors' resolution time, as 
illustrated in Fig. 0] We assume that the distribution of 
the amplitudes of the events is Gaussian with variance 
a^. The probability distribution for the signal given the 
signal parameters (f , a) is therefore given by 



■ exp 
Zira 



2a2 



together with Eq. H3.3|) . Thus the signal model pa- 
rameters are = (^, a), which give respectively the 
"event probability" and "event variance" characterizing 
the stochastic background. The parameter space 8s for 
this model is 

Gs = {(C,a) I < C < 1 and a > 0}, (3.20) 

and the subset corresponding to a signal being present is 



{(^,a) I < ^ < 1 and a > 0}. 



(3.21) 



Note that our assumption that the time structure of 
events is not resolved by the detector is unrealistic. De- 
tector resolution times can be as small as 0.1 ms in the 
case of ground-based detectors like LIGO ^, and even su- 
pernova bursts are expected to have time scales > 10 ms 
[34. 35]. It will be important for future studies to relax 
this assumption. 

We now compute the maximum likelihood detection 
statistic A|JjL for our simple non-Gaussian signal model 
by substituting Eq. (|TT^ into Eq. (|T7|) . This yields 



(3.19) 
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6 This is true at fixed signal-to-noise ratio p. ^^^^^^ ^^^^ 0.1 ms 113, which may help with this issue. 

For ground-based detectors, the eff'ective resolution time in 
a cross-correlation between two detectors can be considerably 



aNG 



max 

0<?<1 



max max max 

O!>0 CTi>0 tT2>0 



0x02 



(l-e)cxp 



N 

n 

fc=i 

, 2 



2al 



(Ml 

2aj 



afa^ 



exp 



(hi 



2al 



2a| 



(3.22) 



The values of ^, a^, af, and (t| which achieve the max- 
imum in Eq. H3.22|l are, respectively, estimators of the 
signal's Gaussianity parameter, the variance of the sig- 
nal events, and the noise variances in the two detectors®. 
Note that if we evaluate Eq. H3.22|l at ^ = 1, rather than 
maximizing over ^, we recover Eq. 13.9|l and the statistic 

We mention in passing an approximate version of the 
statistic (|3.22(l which is significantly easier to compute. 
Expanding the logarithm of the quantity to be maximized 
in Eq. H3.22|l as a power scries in to fourth order about 
— yields the approximate statistic A|^l given by 



4 8 8 y 2 \ " 

In A|^l ~ max max max max f 



o<5<i a>Q CTi>o <J2>o — ' — ' ■' — ' \ a\a2 

n— /— m— ^ 



(3.23) 



k=\ 



where the coefhcients Cnim{£,, f^, erf ) vanish unless I + m 
is even and I + m < 8. In evaluating the statistic H3.23|l . 
one can first evaluate the 24 sums 



N 
k=l 



(3.24) 



for the required values of / and m, and subsequently nu- 
merically maximize over the parameters ^, a, cri, and (72. 
Thus the length-iV sums need only be performed once, 
rather than each time one tries a new set of values for ^, 
a, (Ti, and (T2- Therefore the computational cost of Aj^^ 
is only about an order of magnitude greater than that of 
the cross correlation statistic A^c i s-^d this statistic may 
be useful to explore. 

We now derive the signal-to-noise ratio p for the 
cross-correlation statistic and for the non-Gaussian sig- 
nal H3.19|l . If the signal is present, then from Eqs. H2.1|l . 



See Ref. l36l for a derivation of a statistic similar to and 
designed for the same non-Gaussian signals which is based on 
Eq. rather than Eq. ITT^ . 



ijTTUIl . (j^lT|l and ijTT^ the expected value of 



a 



IS 



(3.25) 



If no signal is present then the fluctuations in are given 

by 



-2\ _ ^^1^2 



N 



(3.26) 



Therefore, taking the ratio of Eqs. H3.25|l and H3.26|l . the 

signal-to-noise ratio p is 



(3.27) 



(Tl(T2 



IV. PERFORMANCE COMPARISON 

In this section we compare the performances of the 
cross-correlation statistic (I3.14f) . the burst statistic (|1.9|l . 
and the maximum likelihood statistic (|3.22|l for our 
model non-Gaussian signal described in Sec. IIII CI The 
comparison is quantified in terms of the false alarm versus 
false dismissal curves, as discussed in Sec. m above. In 
Sec. II V Al we discuss analytic predictions for these curves 
for the three different statistics. Section FlV Bl describes 
our Monte Carlo simulation algorithm, and Sees. IIV CI 
and IIV D1 describe the results. 



A. Analytic computation of asymptotic behavior of 
statistics 

We start by discussing the set of parameters on which 
the false dismissal versus false alarm curves can depend. 
As before, we assume two detectors with noise charac- 
terized by Eq. H3.1|l with Vn = (ci,(T2), and a non- 
Gaussian signal characterized by Eqs. H3.3|) and (|3.19|) 
with Vs — {^,ct). The curves for each statistic are given 
by some function 



^FD = Pfb{Pfa, t a, (Ti, 0-2, A^) 



(4.1) 
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of the false alarm probability PpA, the Gaussianity pa- 
rameter ^, the rms amplitude a of events, the noise vari- 
ances af and (t|, and the number of data points N. We 
can simplify Eq. 1)4. 1|) by replacing a with the signal-to- 
noise ratio p using the definition (|3.27l) . and noting from 
dimensional analysis that PpA depends on ci and (72 at 
fixed p only through the ratio cri/(T2. This gives 



-PfD = PF-DiPFA,tp,<^l/(T2,N). 



(4.2) 



For simplicity, we specialize to <7i — for the remainder 
of this paper. This implies that 



PfD = PFD(PFA,^,P,iV). 



Cross correlation statistic 



(4.3) 




Pfa 



The false dismissal versus false alarm curves for the 
cross-correlation statistic can be computed analytically 
in the large N limit, as we now describe. Our derivation 
generalizes the analysis of Ref. |23| from Gaussian to non- 
Gaussian signals. For any detection statistic F, we can 
express PpA and PpD in terms of the detection threshold 
F>^ as 

/■OO 

PpA{T„ai,a2,N) = pr|r(a^|0), (4.4) 



Here the definition of the random variable T is such that 
if T = then no signal is present {£, — P = 0), and if 
T ~ 1 then a signal is present (^ 7^ and p 7^ 0); cf. 
Sec. Ill Al above. Note that by eliminating F* between 
Eqs. (|4.4|) and (|4.5|) . we recover Eq. (|4.1|l . 

In the large TV limit, the distribution p\^^\-rix\t) is 
a Gaussian by the central limit theorem, and the inte- 
grals H4.4|l and (|4.5|l can be evaluated analytically (see 
Appendix IB|| to give 



Pfd (Pfa,C,P,^)-1 



(4.6) 



- - erfc 
2 



crfc-i (2Pfa) - ^ 




O 



Here the function erfc (a;) (known as the compliment of 
the error function) is defined by 



erfc(a;) = — = 



dy e' 



(4.7) 



and erfc~^(a;) is the inverse of erfc(a;). The formula 
(|4.t)|) is valid only for PpA < 1/2; Pfd is undefined for 
1/2 < Pfa < 1. In deriving Eq. (I4.t)|) . we assumed that 



FIG. 5: Sample false dismissal versus false alarm curves for 
the cross correlation statistic Acc in the large TV limit, as 
prescribed by Eq. 1)4. 8|l . For these curves the signal-to-noise 
ratio p has equally spaced values from 0.01 to 1. Note that 
here Pfd is undefined for 1/2 < Pfa < 1. 



the statistics Acc and are equivalent, and that the dis- 
tribution for is Gaussian. Those assumptions are only 
valid up to fractional correction terms of order 1/\/TV; 
hence the indicated correction term in Eq. (|4.6(l . 

In the regime where ^ TV^ in addition to TV 1, 



(4.5) the result ()4.6|l simphfies to 



PFD(PFA,e,P,TV) = 1 



O 



- erfc 
2 

1 



erfc-i (2Pfa) - ^ 



O 



P_ 

TV 



(4 



Note that the false dismissal versus false alarm relation 
(|4.8|) is independent of both TV and ^. Sample curves 
from Eq. (|4.8|l are shown in Fig. |5| The discontinuities 
at Pfa — 1/2 are a result of the step functions in the 
definition H3.11|l of . 



2. Burst statistic 

By combining the definition H1.9|) of the burst statis- 
tic together with the decomposition (|1.6|l . the noise and 
signal distributions (|3.1(l and (|3.19|l . and the change of 
variables H3.27|l it is straightforward to derive the exact 
false alarm versus false dismissal relation. The result is 
given by 



(l-PFA)^/^=crf(A| 



(4.9) 
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and 



A, 



+ (l-Oerf(^j, (4.10) 



where A* is the value of the threshold. 



3. Maximum likelihood statistic 

We start by discussing the different regimes present in 
the space of signal parameters ^, p and N, treating the 
false alarm probability PpA as fixed. There are several 
different constraints on the three parameters ^, p, and 
N that define the regime in parameter space where we 
expect our maximum likelihood statistic to work well. 
First, it is clear that the total number of events ^ in 
the data set must be large compared to one: 



1 

N' 



(4.11) 



Second, if the signal-to-noise ratio a^/((JiCT2) of indi- 
vidual burst events is large compared to one, then one 
can detect the individual events using the burst statistic 
(|1.9|) and the method of this paper is not needed. From 
Eq. H3.18|l we can write the constraint a^/ (171(72) 1 as 



> 



N 



(4.12) 



A more precise version of this requirement can be ob- 
tained by noting that the detection threshold for the 
signal-to-noise ratio a^/(criCT2) is ^ \/2\nN, since there 
are N independent trials. This yields the constraint 



P 



V2N\nN 



(4.13) 



The regime ^ ~ p/v2N\nN is where the burst statistic 
Ab starts becoming as sensitive as the cross correlation 
statistic, as can be seen by combining Eqs. H4.8I) . (|4.9() 
and H4.10|l above. This behavior can also be seen in Figs. 
Hand [3 above. 

A third constraint on the space of signal parameters is 
derived as follows. Consider the statistic 



1 



N 

N ^ 

k=l 



(4.14) 



We can use this statistic to estimate the Gaussianity pa- 
rameter ^ in the following way. The mean value of rj 
when a signal is present is given by 



and the variance when a signal is absent is 



N' 



(4.15) 



(4.16) 



It follows from Eqs. H4.15|l . (|4.16|l . and the relation 
l^oi^) — that the estimator ^ of defined by 



i = 



has a fractional accuracy of order 



(4.17) 



(4.18) 



Now in the regime A^/^ ^ 1, we expect our maximum 
likelihood detection statistic to work well, since one's first 
guess for a nonlinear statistic (|4.14() can be used to de- 
tect the non-Gaussianity of the signal to high accuracy. 
In the regime A^/^ 3> 1, it is not obvious how the max- 
imum likelihood detection statistic will perform, since it 
could have a performance much better than that of the 
statistic 77. However, our Monte Garlo simulations [Sec. 
IIVBI below] and analytic computations [Appendix IC] in- 
dicate that the maximum likelihood statistic does indeed 
perform poorly in the regime A^/^ 3> 1. Thus, our third 
constraint is A^/^ < 1, which from Eq. H4.18|l can be 
written as 



< 



N 



(4.19) 



Our Monte Garlo simulations show that for p'^/vN < 
^ < 1, the maximum likelihood and cross-correlation 
statistics perform roughly equivalently, and that once ^ 
becomes smaller than p'^/y/N, the maximum likelihood 
statistic starts to perform significantly better than the 
cross-correlation statistic; see Figs. Q] and |2 above. 

In Appendix[niwe derive analytically the approximate 
expression (IG46|) for the false dismissal probability for 
the maximum likelihood statistic, which we expect to be 
accurate up to corrections of order 1/p* or a few tens of 
percent. We also derive the expression (|G55|) for the false 
alarm probability using a combination of analytical and 
numerical techniques. Gombining these results gives the 
curves which are associated with the maximum likelihood 
statistic and labeled "analytic" in Figs. [2 H El and 

cni 



B. Description of the Monte Carlo simulation 
algorithm 

Next we describe our Monte Carlo simulations of the 
performances of the various statistics. We numerically 
estimate the false dismissal and false alarm probabilities 
PpD and PpA by conducting an ensemble of Ne simulated 
experiments. For each experiment we simulate a detector 
output matrix, half of which have a signal present, and 
half of which do not. Since we know in advance whether 
or not a signal is present, we can easily estimate PpA 
and PpD- More specifically, our algorithm for simulating 
false dismissal versus false alarm curves, for an arbitrary 
statistic F, is as follows: 
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1. Choose values for ^, a, ai, (T2, and N. 

2. Choose the total number of trials Ne- 

3. Forr = 1,2, . . . , Ne/2: 

(a) Generate a data train h{ai,a2, N) of noise 
only. 

(b) Compute T and store result as Fro- 

(c) Generate a data train a, fii, (T2, N) which 
has a signal present. 

(d) Compute F and store result as F^i. 

4. Choose a discretization F*j of the set of thresholds, 
where j = 1, 2, . . . , M. 

5. Set Pya{T*j) = PFD(r*j) = 0, for each j. 

6. Forr = 1,2, Ne/2: 

(a) for each j, if F,.o > F*j, increment fVA(F*j) 
by 2INe. 

(b) for each j, if F^i < F,j, increment PFD(r*i) 
by 2/7Vb. 

7. Repeat steps 3-6 above several times to estimate 
the fluctuations in PFA(r*j) and PFD(r*j)- 

We use the above algorithm to simulate false dismissal 
versus false alarm curves for the three statistics Acc: Ab 
and h^^- The analytical expressions (|4.6|l and H4.9|l - 
(|4.1U|) for the cross-correlation and burst statistics are 
used as a check of the numerical method. 



C. Simulation results 

A family of simulated false dismissal versus false alarm 
curves for the cross correlation statistic Acc and the 
maximum likelihood statistic AJ2|£ is shown in Fig. 
We see that at fixed p, as the Gaussianity ^ of the signal 
decreases, Aj^^ performs increasingly better than Acc- 
The curves for Acc are almost indistinguishable from 
each other because p is fixed, and the curves depend only 
on p and not on ^ for this detection statistic in the large 
N hmit [cf. Eq. above]. 

If we maintain the same value for p as in Fig. El but 
take ^ > 0.03, the curves for Acc and cannot be 
distinguished from each other. We find in general that 
for any values of N , ai, a2, and p, as ^ — s- 1, the false dis- 
missal versus false alarm curves for Acc and cannot 
be distinguished from each other. Thus, the two statistics 
are nearly equivalent for Gaussian signals, as expected. 
However, for ^ <C 1, Fig. demonstrates that Aj^^ psr- 
forms noticeably better than Acc- 

We now discuss a comparison of the two statistics in 
terms of the minimum gravitational wave energy den- 
sity necessary for detection, instead of in terms of the 
false dismissal versus false alarm curves. For a stochas- 
tic background with rms strain amplitude /irms, we have 




Pfa 



FIG. 6: Plots of false dismissal probability (-Pfd) versus false 
alarm probability (-Pfa) for the standard cross-correlation 
statistic Acc and our maximum likelihood statistic AJJ[l. 
Each of these curves is characterized by a total number of 
trials A''^; = 2 x 10'', number of data points A'' = 5 x 10*, 
noise variances ai — a2 = 1, and by the signal-to-noise ratio 
p = 1. The values of the Gaussianity parameter ^ are 0.02, 
0.012, and 0.01. The solid curves are the results for A^Li 
these curves are bunched together because p is fixed. The 
dashed curves are the results for AJ^l- For the dashed curves, 
the lowest curve is for ^ = 0.01, while the highest curve is for 
^ — 0.02. We estimate error bars for each of these curves by 
separating the 2 x 10* runs into 10 bins of 2 x 10'^, and gener- 
ating 10 separate plots; the resulting fluctuations are < 10"'^. 
The curves for the cross correlation statistic A^l agree with 
the analytic prediction 14.611 to within ~ 10""^. This plot 
shows that AJJ[l can perform significantly better than Acc- 



n oc /i^jjjg where is the gravitational wave energy 
density. For our model signal (|3.19l) we have h^^^ oc ^a^, 
and comparing this with the formula p cx from Eq. 
(|3.27|) shows that we can interpret the signal to noise ra- 
tio p as the energy density in the stochastic background, 
even for non-Gaussian signals. 

We compute the minimum detectable energy density 
or signal-to-noise ratio Pdctcctabic as follows. First, we 
choose thresholds Pfa* and Pfd* for the false alarm 
and false dismissal probabilities. We refer to the pair 
(Pfa*,Pfd*) as the detection point. For any statistic F, 
the choice of detection point determines the detection 
threshold F* , and inverting Eq. H4.3|l gives the minimum 
detectable signal-to-noise ratio 



P = Pdotoctablc(PFA*,PFD*,C, A^), (4.20) 

as illustrated in Fig. [7| For the cross-correlation statistic 
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FIG. 8: The minimum detectable signal-to-noise ratio 
FIG. 7: A family of false dismissal versus false alarm curves pj^'^^^t^bio for the cross-correlation statistic Acc as a function 
for fixed f . Here the detection point, at Ppd, = Ppa, = 0.1, of the false alarm probability threshold Ppa*. Note that we 
is marked with *. assume the false dismissal probability threshold Ppd* ~ Pfa*- 



Acc the result is, from Eq. H4.6|l . 



r detectable 



2V27 [1 + 



+ 272 (1 - 1) /N 



l + O 



(4.21) 



(4.22) 

where 7 = erfc~^(2PFA*) and we have assumed that 
Pfa* = PpD*- This relation is plotted in Fig. |S1 

From the results of our simulations, we determine 
Pdetectabic(PFA*, Pfd*,'?, A^) by numerically solving the 
equation 



Pfd(Pfa*,C,P,^)-^fd* =0 



(4.23) 



for p. Unfortunately, evaluating the function on the left 
hand side of Eq. H4.23|l is computationally expensive. 
Each evaluation involves simulating the false dismissal 
versus false alarm curve which is itself a computationally 
intensive task. Moreover, it is only feasible for us to solve 
Eq. (|4.23() for values of iV < 10'* while a realistic detec- 
tion scenario for ground based detectors would involve 
a year's worth of data sampled at ~ 100 Hz for which 
N ~ 10^. Therefore our conclusions about the applica- 
bility of the method to ground based detectors are based 
on our analytic results, as discussed in the Introduction. 

Figure |^ shows the results obtained from numerically 
solving Eq. (|4.23|1 for pdotcctabio for the parameter values 
C = 0.02, Pfa* = Pfd* = 0.1, and Fig. UUI shows the 
corresponding results for ^ — 4.3 x 10"'^. For the cross- 
correlation statistic, the results are in good agreement 
with the analytic prediction (|4.21|) . 
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4,000 5,000 6,000 7,000 8,000 9,000 10,000 

FIG. 9: The minimum detectable signal strength pdotcctabic as 
a function of the number of data points N, for the false alarm 
probability threshold Pfa* = 0.1, false dismissal probability 
threshold Pfd* =0.1, and Gaussianity parameter ^ = 0.02. 
The circles are the simulation results, and the error bars are 
estimated from ten different runs. The solid curve is the an- 
alytical prediction H4.21^ for Acc, and the dotted line is the 
N ^ <x limit (14.2211 . The dashed line is the analytic predic- 
tion for given by Eqs. i^lel and (IHSSll . 



Figure^shows the minimum detectable energy density 
as a function of the Gaussianity parameter ^ for N — 10^ 
(corresponding to space based detectors), for the cross- 
correlation and maximum likelihood statistics and also 
for the burst statistic H1.9|l . We again use the values 
Pfa* = -PpD* = 0.1. The figure shows that the maximum 
likelihood statistic performs better than the other statis- 
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FIG. 10; Same as Fig. El but with ^ = 4.3 x 10"^ 



tics by a factor which is roughly 3 for ^ of order 1%. For 
smaller values of ^, the maximum likelihood performs in- 
creasingly better than the cross-correlation statistic, but 
is eventually comparable to the burst statistic. Thus the 
maximum likelihood statistic gives an improvement in 
sensitivity to backgrounds composed of roughly 10 to 10"^ 
events per year. 

Figure |21 is a similar plot, without the Monte Carlo 
simulation results, for N = 10^ (corresponding to ground 
based detectors). Here we use Pfa* = Pfd* — 0.01. The 
results are similar to those in Fig.Q] except that here the 
gain in sensitivity occurs in the band 10~^ < £, < 10~^. 
This band corresponds to lO^'-lO^ events per year. 



D. Parameter estimation 




0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 

FIG. 11: Representative contours of In A(^, a'^, (Ti, (t|). Here 
p = 20 and = 1.6 x 10^. The simulated signal is character- 
ized by ^ = 0.2 and = 0.25, marked with an x. The noise 
is characterized by cti = = 1. The maximum, marked with 
a is found at In A(0.207, 0.251, 0.993, 0.993) = 229, while 
lnA(0.2,0.25, 1,1) = 227. 



V. CONCLUSIONS 

The use of our maximum likelihood statistic in searches 
for a non-Gaussian background gives a gain in sensitiv- 
ity over the standard cross-correlation statistic. Figures^ 
and|21show that the gain factor can be significant for suf- 
ficiently non-Gaussian signals. However, computing the 
maximum likelihood statistic requires significantly more 
computational power than the cross-correlation statistic. 

The analysis presented here must be generalized in sev- 
eral ways before being usable in gravitational wave detec- 
tors. These generalizations, listed in order of importance, 
are: 



The computation of the maximum likelihood statistic 
also serves to measure the parameters of the signal. The 
statistic A|^Li from Eq. 13. 2211 . can be written as 

^ML — max max max max , , , a^) ■ (4.24) 

0<5<1 a2>0 CT?>0 o-|>0 

The point (^, o;^, a^, ) where this maximum is achieved 
is the maximum likelihood estimator for (^, a^, CTj , (T2)- 
In Fig. ^2 we show contours of the function In A for a 
strong (jO = 20) signal. This figure shows that both 
and can be measured with good accuracy. 

Note that the main benefit of using is that it 

allows us to detect signals that are too weak to be seen 
using Acc • Using also allows one to test if a detected 
signal is Gaussian, as obtained above, but this is not the 
main benefit of the method, as there are other, simpler, 
methods to test for non-Gaussianity. 



• Our signal model H3.19|l assumes a Gaussian dis- 
tribution of amplitudes of the burst events. This 
assumption simplified our analysis and resulted in 
a statistic with the useful property of being nearly 
equivalent to the cross-correlation statistic in the 
Gaussian signal limit. In practice however, the dis- 
tribution of the events should instead be based on 
the candidate sources. For example, a popcorn-like 
stochastic background produced by a spatially uni- 
form distribution of standard-candle sources out to 
some maximum redshift would have a signal distri- 
bution of the form H3.19|l with the Gaussian term 
replaced by a term proportional to s^'^9{s — Smin), 
where 9 is the step function and Smin is a cutoff 
signal strength. 

• One should allow the burst durations to be longer 
than the detector resolution time. For this situa- 
tion one possibility would be to preprocess the data 
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with a lowpass filter, and then apply the techniques 
developed here. Another possibility would be to try 
to combine the analysis of this paper with the ex- 
cess power detection method of Ref. . 

• Real detector noise always contains non-Gaussian 
components, so one needs to generalize the analysis 
to allow for this. Such a generalization for a Gaus- 
sian stochastic background can be found in Refs. 

MM. 



• It would be useful to consider a more general signal 
model which consists of a superposition of a Gaus- 
sian background and a non-Gaussian background, 
since the true gravitational wave background might 
consist of such a superposition. 

• The analysis needs to be generalized to allow for 
colored detector noise, and separated, misaligned 
detectors. This generalization should be fairly 
straightforward. 
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APPENDIX A: GENERAL FORM OF THE 
LIKELIHOOD RATIO 

In this appendix we give two derivations of the general 
formula (|2.15(l for the likelihood ratio. The first deriva- 
tion is based on Eq. H2.13|l while the second is based on 
Eq. H2.14|l . We also derive the formula (|2.18|l for the 
posterior probability density py y ^j- (v^ , v„ 1 1 ) . 



1. First derivation 

We can derive Eq. 1)2.15(1 by using the total probabil- 
ity theorem to expand the distributions in the numerator 



and denominator of Eq. H2.13() . Note that all distribu- 
tions in this derivation are priors. 

First expand pnih) just in terms of the random vari- 
able T 

Pnih) = PT{l)Pn\T{h\l) + PT{0)pn\T{h\0). (Al) 
Expanding p-^ (h) in terms of all the degrees of freedom 
yields 



The ratio of the coefficients of iV(l) and -Pr(O) in 
Eq. l|A2p will give the general expression for the likeli- 
hood ratio by Eq. (|TT^ . 

The conditional distribution for H in Eq. l|A2p can be 
translated into a conditional distribution for J\f. From 
Eq. H2.1|l it follows that 



Pn\sih\s) = P^+s\sih\s) = PN-\s{h - 



(A3) 



and since S and J\f are statistically independent we ob- 
tain 

Pn\s{h\s) = P^{h~s). (A4) 

Generalizing this argument gives 

PH\T,v^,sy„{h\t,Vs,s,Vr,) =PAr\vJh- s\Vn), (A5) 

since a priori T, V^, and S are statistically independent 
of Af and V„. For the same reason we can write the joint 
distribution that appears in Eq. (|A2|) as 



Pr,v,,5,v„(i,Vs,s,v„) =pr,v,,5(^,Vs,s)pv„(v„). (A6) 
Substituting Eqs. (|X5|) and (|X6|) into Eq. HA2p yields 





I'd^^'s J d'^-Vn 







X PM\vJh - s|v„)pr,v,,5(i, Vs, s)pv„(v„). 
We can also rewrite the distribution pt.v .s{t,'Vs, s) as 

PT,V^,s{t,Vs,s) =_P5|V .T{sWs,t)pv^\T{-Vs,t)PT{t), 

(AS) 

by Eq. Substituting Eq. (|X8|) into Eq. (|J7l) and 

explicitly evaluating the sum over t yields 
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Pn(.h) = -Pr(l) / d'^'^Vs / d'^^s / d'^"w„ pa^|v (/i - s|v„)pv„(v„)p5|v ,t(s|vs, l)pv,|r(vs|l) 



(A9) 



Here we have used the foUowing relations: 

P5|v,.r(s|v, e 9,1,0) - (AlO) 

Pv.|r(vs e e,o|l) - (All) 

Pv.|r(vs e e,i|0) = (A12) 

rf''^^. Pv.ir(vslO) = 1. (A13) 



By comparing Eqs. IjAip and (|A9|) we can read off the 
distributions Pn\T{f^\t) ^^^d construct Eq. H2.15|l from 
Eq. H2.13|l . Note that the expression (|2.13|1 is indepen- 
dent of the space Qso of signal parameters corresponding 
to "no signal present" . 



Second derivation 



Here we derive Eq. H2.15|l . and also Eq. H2.18|l . from 
Eq. H2.14|l . Consider the distribution 



Pt,v,,v„|-h(1,Vs,v„|/i) 



Pt,v,,v„,h(1,Vs,v„,/i) 



Pn{h) 

(A14) 

We will justify Eq. (|2.15|) by the defining relation 
Eq. (|2.14|) . which explicitly refers to priors and posteriors. 
Therefore we now append the appropriate superscripts as 
bookkeeping devices. Eq. ljA14|) then reads 

Pt'v ,v„(1'Vs,v„) = — . (A15) 

Pn W 

Using the expansion of pn{h) given by Eq. (|A9|I . and 
what we will justify is the likelihood ratio A given by 
Eq. H2.15|l . we have 



Pt,v,,v„,h(1'"^s>'^">'') 



AP(") + 1 - P(") 

(A16) 

Expanding the uppermost numerator in Eq. ljA16(l over 
S by the total probability theorem gives 



(A17) 



^ -Pr,V,,V„,-H,5(l'Vs''^n,/l,s) 

and rewriting this gives 

.(0) ^^ „ h JO) 



Pr,v,,v„,H,5(l'V,'Vn>,s) =py^^{h-s\^fs) 



After putting Eq. (|A18|) into Eq. (|A17p . substitu te the 
result into Eq. l|A16p . Using A(vs, v„) given by Eq. (j2.17|l 
then yields 

On the left hand side of Eq. (|A19|) we have used 

Pr\,v„(l'V„v„) = P(i)pj;^)^^_^|^(v„v„|l). (A20) 

Integrate Eq. I|A19() over 9„ and 6si using Eq. (|2.16|l 
and the normalization requirement 



dQ^Vs d'2"v„p^')^ ,^(v,,v„|l) = l (A21) 



to get 



p(i) = 



Ap(" 



AP(o) + 1 - p(o) ■ 



(A22) 



Use Eq. HA22|I and Eq. (|A19|) to form the ratio on the 
left hand side of Eq. lfTTH|l . This justifies Eq. (|TTH|). 

Integrate Eq. (|2.18|) over 9„ and Qsi using Eq. (|2.16() 
and Ea.f lA2l|l to see that the defining relation Eq. (|2.14|l 
is satisfied and thus Eq. (|2.15|l is justified. 



APPENDIX B: ANALYTICAL EXPRESSIONS 
FOR FALSE DISMISSAL VERSUS FALSE 
ALARM CURVES FOR CROSS-CORRELATION 
STATISTIC 

This appendix derives the analytical form 1)4. 6|l of the 
false dismissal versus false alarm curves for the cross- 
correlation statistic Acc in the large N limit, for both 
Gaussian and non-Gaussian signals. A derivation for 
Gaussian signals can be found in Sec. IV of Ref. pol |. 

As noted in Sec. IIII Bl the statistics Acc and are 
equivalent in the large N limit. Thus, in this limit, the 
false dismissal versus false alarm curves can be found 
by evaluating Eqs. (|4.4|) and (|4.5|l with F replaced by 
a^. The relation (|3.11|) between the statistics and 
implies the following relation between their probability 
distributions Pa2\-r{x\t) and Pa^\T{x\t)' 

P&^\Tix\t) 0{x)pa2lT{x\t) + 6{x) / dyPa2lTiy\t). 

J —OC 

(Bl) 
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Inserting this formula into Eqs. (|4.4() and H4.5() gives 




(B2) 



da; pa2|7-(a;|0) if > 

if (52 < 

da; Pa2|7-(x|l) if > 

if d* < 

(B3) 

In the large N limit, the distribution Pa^\T{x\t) must 
be Gaussian by the central limit theorem, and therefore 
this distribution is characterized entirely by its mean 
(dj) and variance [A(dj)]2. From Eqs. (12.1(1 . H3.3(l . 





CT1CT2 



(|XTU|) . llTTT|l and (jXT^ . these are given by 

A(dg) 

(«?) 
A(d?) 



iV 



(B4) 
(B5) 

(B6) 
(B7) 



Substituting Gaussian distributions, with means and 
variances determined by Eqs. HB4f) - (|lj7f) . into Eqs. HB2f) 
and ||B3|) yields 



-PFA(d^f7i,(T2,A^) 



— erfc ( W — 
2 \^ai(72 V 2 ^ 

1 



if d^ > 
if d2 < 



(B8) 



1 erfc 

2 



2[ea4(3-0+e«^(a2 + ^l)+^?^i] 



if dj > 
if d? < 



(B9) 



If we now eliminate d^ between Eqs. ((B8|l and ((B9|l . 
change variables from a to p using Eq. I|3.27|l . and set 
CTi = (72, we obtain Eq. (|4.6|l . 



APPENDIX C: ASYMPTOTIC BEHAVIOR OF 
MAXIMUM LIKELIHOOD STATISTIC 

In this appendix we derive the large- A'^ behavior of the 
maximum likelihood statistic A.^^- From Eq. (|3.22|) . we 
can write the statistic in the form 

AZt{h)^e^p[NC{h)] (CI) 

with 

Cih) = max g{ai,a2,S,,ot,h) (C2) 

where 

1 ^ 

= X! 5'=('^i''^2,C,a), (C3) 
fc=i 

and the function — gk{o'i, (^2, Q;) is given by 

e^^" = ^Ak{a) + {1 - OAkiO) (C4) 



with 




We denote by cti, (72, ^ and d the "true" parameters 
governing the distribution of the quantities hi and 
according to Eqs. (EHJ, (EiH), ESJ, and (pHl^ . with 
untilded quantities replaced by the corresponding tilded 
quantities. [These "true parameters" were denoted by (Ji, 
(T2, ^ and a in the body of the paper.] We define p to be 
the signal-to-noise ratio H3.27|l with untilded quantities 
replaced by tilded quantities: 

p= ^ : ■ (C6) 

(Tl(72 

For simplicity, in this appendix we restrict attention to 
the case ai —02- Then, without loss of generality, we 
can take ai = (72 = 1 by rescaling our units of strain 
amplitude. 

We discuss separately the computation of the false 
alarm and false dismissal probabilities, as different tech- 
niques are required to compute each. 
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1. False dismissal probability 

The false dismissal probability for the statistic (jC2|l 
will be some function 

Pfd = PFD(CiV,e,p) (C7) 

of the threshold on £, the number of data points N , 
the Gaussianity parameter ^ and signal-to-noise ratio p of 
the signal. For apphcations to ground based detectors, 
we will have p ~ (a few), in order that the signal be 
detectable, N ~ 10^, and lO'^ < | < 1. Therefore it 
would be useful to find approximate analytic expressions 
for the false alarm probability in the limit of large N . 
There are actually several different, large N regimes in 
the three dimensional parameter space (A^, ^, p) that one 
might explore: 

• The limit N oo with a and held fixed. This 
corresponds to fixing the stochastic background sig- 
nal and going to a limit of long observation times. 
In this limit we have p cx ViV which diverges. This 
is not a very realistic limit to explore. 

• The limit N ^ oo with p and ^ held fixed. In this 
limit, the signal-to-noise ratio is held fixed, and 
correspondingly the amplitude a of the stochas- 
tic background signal goes to zero, from Eq. ljC6p . 
This would be the most natural limit to explore. 
However, in this limit the statistical error in our 
measurement of the Gaussianity parameter would 
diverge, from Eq. H4.18|l . and therefore in this limit 
we do not expect to be able to compute analytically 
the value of the parameter ^ which achieves the 
maximum in Eq. I|G2|I . The analytic approxima- 
tion methods which we discuss below do not work 
in this regime. [In addition our Monte Carlo simu- 
lations show that the maximum likelihood statistic 
itself does not perform any better than the cross- 
correlation statistic in this regime, as discussed in 
the Introduction.] 

• The limit we actually explore is the limit N oo 
with ^ fixed and p scaling oc iV^/**, corresponding to 
a <x 7V~^/*. The reason for our choosing to explore 
this particular limit is simply that it is amenable 
to analytic computations. Fractional corrections to 
our analytic results should scale like 1/iV or as 1/p^. 
Since p ^ {& few) at the threshold for detection, the 
approximation should be good to 10% — 20% or so. 

We now turn to a discussion of the computational tech- 
nique. We write 

a = aoN-^/^, (C8) 

where ao is independent of N. Correspondingly, from 
Eq. H3.19|l we can write 



where the distribution of s*^ is given by Eq. H3.19|l with 
^ replaced by ^ and a replaced by ao- In particular, the 
distribution of s*^ is independent of N. In computing 
the maximum over (^, a, cti, (72) in Eq. HC2|I . it is useful 
change variables from a to k defined by 

.^pN-^f^^^-^, (CIO) 
criO-2 

which we expect to be independent of N to leading order 
in the large N limit. The value of the variable k that 
characterizes the signal is 

= ^ (Cll) 

critT2 

cf. Eqs. ^ and lIHTOll . 

We now consider fixed realizations of the infinite se- 
quences of random variables rii, n\ and , and 1 < fc < 
CX), and examine the limiting behavior of C{h) as iV — s- 00. 
We compute this limiting behavior by substituting into 
the right hand side of Eq. ljCl|) the relations 

h\^n\ + N-^''^s'' hl^n^ + N-^/^s^, (C12) 

writing a in terms of k using Eq. IjClOp . and expanding 
in powers of N~'^/^. The result is an expression which 
can be written in terms of the sums Qabc defined by 

Qa^c = ]^Y.{^r{n\)\nl)\ (C13) 
fc=i 

where a, 6, and c are non-negative integers. From the 
central limit theorem we can write 

Qabc — P-abc H l=^abc, (d4) 

V N 

where pabc = {Qabc) are computable functions of ^ and 
a, and where the random variables (Aioo, Aqio, ■ • ■) con- 
verge in distribution ^ as — > 00 to a multivariate Gaus- 
sian of zero mean whose variance-covariance matrix is 
independent of N . Thus, in particular the joint distribu- 
tion of all Aabc's is iV-independent in limit that N ~* 00. 
We define the vector 

^={v\v\v\v^)^{^,K,al,cyl), (C15) 

We denote the value of v that achieves the maximum in 
Eq. inSl by v: 

g(v) = max 5(v), (C16) 



See chapter 8 of Ref. |29| for definitions of different notions of 
convergence for sequences of random variables. 
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where v = k, ct^, (t|). These estimators satisfy a sys- 
tem of four equations 



and 



dg_ 

dv'- 



0. 



(C17) 



We solve Eq. IjClTp perturbatively. First assume that the 
estimators can be expanded in the form 



yl _ yl gi ^ 

J=0 



(C18) 



where for ease of notation we have defined e = iV^^/®. 
We define the expansion coefficients u'l-'I analogously by 
an expansion of the form (|C18p but without the hats. 
Now using Eq. (|C14|I the function g can be expanded as 
a power series in e whose coefficients are functions of , 

llabc, and Aabc- 



(C19) 



Substituting the expansions 1(^18)1 and (|nT9)l into the 
condition IjClTp for a local extremum gives an infinite 
set of equations which must collectively be satisfied by 



the coefficients 



Qyllk] 



(C20) 



We solve these equations order by order to determine 

the coefficients , and thereby justify a posteriori the 
ansatz HC18|) . 

We find that in order to compute the leading order 
expression for C, wc must obtain the expansion for ^ to 
zeroth order in e, the expansion for k to fourth order in 
e, and the expansions of and (t| to sixth order in e. 
The leading order results are 



K = 

I _ 

i = 

3 _ 



Y 



l + 0(e2), 
1 + 0(6^), 



0{e% 
+ 0(e), 



where 



X 



iOll 



(C21) 
(C22) 

(C23) 
(C24) 

(C25) 



Y = 



1 



4(Ao3i + A013) - 12(Aoo2 + A020) 



-24Aoii + Ao4o + Aoo4 + 6A022 



(C26) 



Using Eqs. (EHJ, ^J^, (1^13)1 and (|UT4|l one can show 
that the random variables X and Y are independent 
Gaussian random variables of zero mean and unit vari- 
ance. 

In deriving Eqs. HC2ip - HC24|I we assumed that the 
value of V which achieves the maximum in Eq. HC2|I cor- 
responds a local maximum. However, if the right hand 
side of Eq. ljC21|) is negative, the maximum will instead 
be achieved on the boundary of the parameter space at 
K = 0, since the variable k must be non-negative. Sim- 
ilarly, if the right hand side of Eq. (|C22|) is less than 1, 
the maximum will be achieved at ^ = 1, since 1/^ must 
lie in the interval [1, 00). 

Substituting the results (|C21|I - (|C24|I [together with 
the higher order corrections to those results which we 
have not shown] into the expansion for the statistic £, 
and taking into account the various special cases dis- 
cussed in the last paragraph, gives 



C 



+ -{k + €^X f€^ - K-e" 



-kUe' + kVe^ 



e{k + e^X) + 0(6^). (C27) 



Here 9(x) is the step function and 



q 

U 
V 



1, 



1 

I 
Aioi 

A200 



Ai 



10, 



2'«(Aoo2 



(C28) 
(C29) 
2kAoii. (C30) 



We note that the corresponding expression for the 
statistic (lnA^L)/iV [which is equivalent to the cross- 
correlation statistic by Eq. (|3.13l) ] is given by Eq. I|C27|) 
with the first term in the square brackets dropped. 

Next we drop all the terms in the square bracket in 
Eq. ljC27|) other than the first two terms. The reason 
is that these terms will give corrections that are smaller 
than the terms retained (both in expected value and in 
fluctuations) by a factor of 



Here we are assuming that the maximum is achieved as a local 
maximum in the interior of the 4 dimensional parameter space. 
Cases when the maximum is achieved on the boundary are dis- 
cussed below. 




which will be small compared to unity for all cases we are 
interested in. This gives for the false dismissal probability 
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the expression 

Pfd = P{L<L.,) 
dxdy 



n 



27r 



■ exp 



where 



(C31) 



The integrand in (jC39|l peaks at r cos 9 = xo,r sin 6 = yo. 
In order for PpD to be small, its necessary that this peak 
occurs outside the domain of integration, at r > tq. So 
we must have 



2 , 2 \ 2 

Xo+yo> Tq. 



(C40) 



- / 2 

Xq = K/e 
2/0 



ro = 



qn 



(C32) 
(C33) 
(C34) 



Here the region TZ in the x, y plane is the union of the 
two regions 



and 



X > 
y > 
x^ + y^ < rl 



V < 
< X < ro. 



The integral over the region (|C36|I is 



4d = V{-y^)[V{r^ - xo) - V{-xo)l 



(C35) 



(C36) 



(C37) 



where 



1 /"^ 1 

V{x) = 1-7: erfc(x/\/2) = / dt—= exp[-iV2]- 

2 J-00 v27r 



(C38) 



The integral over the region l|C35|) can be written as 



4d ^ TT d0 drr 
27r Jo Jo 



X exp 



-i(rcos6'- xo)^ - ^{rsinO ~ yof 



(C39) 



The criterion Xq > tq is, in order of magnitude, just the 
usual criterion for detectability with the cross-correlation 
statistic. The criterion j/o ^ ''0 reduces to, in order of 
magnitude. 



N 



(C41) 



which is what we claimed earlier to be the regime where 
the maximum likelihood statistic starts to work well, cf. 
Sec. ITV A 31 above. 

Evaluating the integral l)C39|l using the Laplace ap- 
proximation gives 



P. 



(2) 



FD 



ro(A- l)V27rA 



1 + 1 — 



exp 



(C42) 



where we define the variables A and 7 by 



ixo,yo) = roA(cos7,sin7). 



(C43) 



However, the result (IC42II is not very accurate for small 
tq. Alternatively we can integrate over r in Eq. l|C39p to 
obtain 



(2) 



FD 



V2 r -, .o=^(i + a2) 

del — e — 2 — 



2tt 



e 2 



o^o A cos(7-e) 



+ 



foA la 



2V27r 



^[cos(2{7-0})-l] cos (7-0) 



erf 



ro A cos{7 — 9} 
V2 



erf 



ro {l-Acos[7-g]} 
V2 



(C44) 



where The integral 1C44() can be evaluated numerically. The 

false dismissal probability is then given by 



erf (x) ^ — I dye-y . (C45) ^ ^(i) ^ ^ ^ ^^^g^ 
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with Pfjl given by Eq. ifUSTl) and Pfj^ given by Eq. 
"]4l. 



Cumulative distribution function 



2. False alarm probability 



The false alarm probability is some function 



Pfa = Pfa{C,,N) 



(C47) 



of the threshold £* value of the detection statistic (|C2|) 
and of the number of data points N. It does not depend 
on the signal parameters p and ^ because no signal is 
present. We would like to evaluate this quantity in the 
large N limit. 

We start by rewriting the statistic ljC2|l in the form 

f 1 ^ 1 ^ 1 

^ { k=l k=l J 



where 



iyt ck = — — — 1. 



(C48) 



(C49) 



A 



O 



-1 - 




FIG. 12: The cumulative distribution function for the lead- 
ing order expression IIC521 for the statistic when no signal is 
present, obtained numerically. The solid line is for A'' — 1000, 
and the dashed line for A'' = 5000. 



Consider first the first term in Eq. ljC48|) . Using the 
definition (|C5|I of Ak{a) and the definition (|1.4() of (Ti 
and (72 we can write this term as 



A 2 A 2 

1 Y: In A,(0) = - ^ + 0{Aal,Aal) 



(C50) 



k=l 



where Acri = ci — tJi , Aa2 — a2~&2- Therefore the first 
term is maximized at cri = ai, 12 ~ Below we shall 
show that the second term in Eq. (|(y48p is of order O(e^), 
where in this subsection we define e — 1/^/N. Therefore 
the values of cti and 172 that achieve the maximum are 



ai = ai [1 + 0(6')] 
(T2 = ^2 [1 + 0{e^)] 



(C51) 



Moreover, in analyzing the second term it suffices to take 
tJi = CTi, (72 = ^2 in order to obtain the statistic to the 
leading O(e') order. Lastly, since we have assumed that 
CTi = 0-2 = 1 and no signal is present, we have tTi,2 = 1 + 
0(e). Hence, in analyzing the second term, it is sufficient 
to take (Ti = (72 = 1- 

The statistic (IC48II therefore reduces to 



(C52) 



1 ^ 

£ = max - V In [1 + ^Vk{a)] + 0(e), 

a, 5 iV ^ — ' 



fe=l 



where from Eqs. (|C5(I and l|C49|l 



1 



: exp 



(C53) 



Here Wk = (rtf + n2)/V2, ^ < k < N, are independent 
Gaussian random variables of zero mean and unit vari- 
ance. 

It is straightforward to numerically compute the distri- 
bution of the statistic (|C52|) , by generating the Gaussian 
variables Wk and numerically maximizing over ^ and a. 
The result is shown in Fig. E| We find that at large N, 
the distribution of NC becomes independent of N, and 
is approximately given by 



P{NC > aoe 



(C54) 



for ^ > 0, where ao ~ 0.42 and f3o ~ 1.08. Therefore the 
false alarm probability is approximately given by 



Pfa = ao exp [-(3oNC4 ■ 



(C55) 



Finally, we remark why it is plausible to expect the 
distribution of NC to be independent of N in the large N 
limit. The numerical maximizations over ^ and a in Eq. 
(|C52|I show that the maximum is nearly always achieved 
at q; ^ 1 or ^ ^ 1. In both these regimes, one can 
obtain some information about the A''-dependence of the 
statistic. 

Consider first the regime ^ <C 1. In this regime we 
can expand the expression HC52|I as a power series in ^ 
to obtain 



1 



N 

max — > 

k=l 



1 



^Vk{a)^-eVk{ar+0{e) 



-0(e). 
(C56) 
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The generalized central limit theorem (reviewed in Ap- 
pendix implies that 



1 ^ 1- 
- J2 = (In^)'' ^Nia), 



(C57) 



fc=i 



where for each fixed a, the distribution of the random 
variable !Fn{oi) becomes independent of N in the large 
N limit. Here 



and 



71 



< a < 1/2 



1/2 < a 







< a < 1/2 
TTt 1/2 < 



(C58) 



(C59) 



The limiting distribution is a Levy distribution with pa- 
rameters p = 1 and 7 = 7i . Similarly we have 



fc=i 



N— {InNY' QN{a), 



(C60) 



where as — > cxd at each fixed a the distribution of the 
random variable Q n (a) tends to a Levy distribution with 
parameters p = 1 and 7 = 72 , with 



J 2 < a < 1/6 



and 



<52 = 



0<a<l/6 

1/6 < a 
1+2q -l/ _ 



(C61) 



(C62) 



We now substitute the results ljC57l) and (|C60|) into the 
expression (IC56II for the statistic, and maximize analyt- 
ically over the quadratic dependence on ^. For a > 1/2, 
the value of ^ which achieves the maximum goes to zero 
as -/V — > 00, consistent with the assumption ^ ^ 1, and 
the result is 



(C63) 



2 a yN(a) 



In the regime a <C 1, if we expand the expression ljC52() 
to quadratic order in a, the result is an expression which 
is a linear function of 1/^ at fixed a^. Hence, when one 
maximizes over values of ^ in the range < ^ < 1, the 
maximum is always achieved either at ^ = or ^ = 1. 



One can show that the maximum to this order is always 
achieved at ^ = 1, and the resulting expression is 



N£ = ^g^ + 0{e), 



where 



1 ^ 
N ^ 



k=l 



(C64) 



(C65) 



has a distribution that is independent of N in the large 
N hmit. 



APPENDIX D: GENERALIZED CENTRAL 
LIMIT THEOREM 

In this appendix we review the generalized central limit 
theorem that can be found on p. 574 of Ref. ■ First we 
define a particular distribution function called the Levy 
distribution. It depends on 3 real parameters, a positive 
constant C, a parameter 7 in the range < 7 < 2, and 
constant p in the range < p < 1 We say a random 
variable X has a Levy distribution with parameters C, 7 
and p if the characteristic function of X is given by 



) = exp Id 



- CT(3-7) 
7(7- 1) 



cos(7r7/2) 



-zsgn(C)(p- g) sin(7r7/2) 



(Dl) 



where q — 1—p. The corresponding probability distribu- 
tion function is obtained by taking a Fourier transform 
and decays like x~^^'^''^ at large a; for 7 < 2 (7 = 2 is the 
Gaussian case). 

Consider now a random variable X with probability 
distribution function f{x) whose variance is infinite. Let 



F{x)^ / dyfiy) (D2) 

^ — 00 

be the cumulative distribution function and define 

fi{x) = I dyy'fiy). (D3) 



Suppose that the distribution satisfies the following con- 
ditions: (i) As a; ^ 00 we have fJ,{x) ^ x'^~'^ L{x), where 
< 7 < 2, and L{x) varies slowly in the sense that 
L{tx)/L{t) ^ 1 as t ^ 00 for all x > Q. (ii) We have 



1 - F{x) 



F( 



F{-x) + 1 - F{x) 



F{-x) + 1 - F{x) 



(D4) 



For a < 1/2 this argument fails, which is why we must nu- 
merically verify that the distribution of NC is asymptotically The parameter 7 is conventionally denoted by a. We use 7 here 
independent of A'^. to avoid confusion with the variable a defined in Eq. 11.71 . 
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as X —I- oo, where 0<p<l, 0<(7<1 and p + q = 1. 
(iii) For 1 < 7 < 2, we assume that the expected value 
Jdxxf{x) vanishes; this can be enforced by making a 
transformation of the form X X + constant. 
We define the sequence of random variables 



Sn — 



1 ^ 

2 — 1 



(D5) 



where the Xi are independent, identically distributed 



random variables with distribution function /, and the 
constants are chosen to satisfy 



Nn{aN) 



C 



(D6) 



''N 



as ^ 00, where C is a positive constant. Then, the 
distribution functions of the random variables Sn con- 
verge to a Levy distribution with parameters C, 7 and p 
as — > 00. 
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